*** Replication for: Gemenis, K. (2018). The Impact of Voting Advice Applications on Electoral Turnout: Evidence from Greece.

*** Hellenic Voter Study

*** Place the 2012_voterstudy.dta file in your working directory in Stata. This file contains only the variables used in the analysis.
*** Alternatively, you can download the full dataset, and convert it to a .csv or .dta file, from here: http://www.doi.org/10.3886/E100022V6

use 2012_voterstudy.dta

*** Creating/recoding the female variable

gen female=d2
recode female (2=1) (1=0) (9=.)

*** Creating/recoding the age variable

recode d1b (9999=.)
gen age=2012-d1b

*** Creating/recoding the education variable
*** The original response categories are recoded as follows
*** Without degree (1) primary school (2) secondary (3) = 1
*** Highschool (4) = 2
*** post-secondary non-tertiary (5) short cycle tertiary (6) = 3
*** Bachelor (7) = 4
*** Master (8) PhD (9) = 5

gen education=d3
recode education (2=1) (3=1) (4=2) (5=3) (6=3) (7=4) (8=5) (9=5) (96=.) (97=.)

*** Creating/recoding the political interest variable

gen interest=e21
recode interest 7=.

*** Creating/recoding the left-right self-placement variable

gen leftright=q11a
recode leftright (95=.) (97=.) (98=.) 

*** Creating/recoding the VAA usage variable

gen vaa_use=e13a
recode vaa_use (2=0) (3=0)

*** Creating/recoding the VAA awareness variable

gen vaa_aware=e13a
recode vaa_aware (2=1) (3=0)

*** Creating/recoding the self-reported turnout for June 2012 variable

gen turnout_june=q5lh_a
recode turnout_june (5=0) (7=.)

*** Creating/recoding the self-reported turnout for May 2012 variable

gen turnout_may=q6a
recode turnout_may (5=0) (7=.)

*** Generating the covariate balancing weights using entropy balancing

ebalance vaa_use interest leftright female age education, target(3 3 1 3 3) basewt(weights) wttreat
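
*** Optional check (not part of the original replication file): compare the weighted covariate means and standard deviations across VAA users and non-users.
*** In target(), 1 balances the mean only, 2 adds the variance, and 3 adds the skewness; the binary female variable therefore needs only its mean.
*** This is a sketch that assumes ebalance stored the weights in its default variable _webal.

tabstat interest leftright female age education [aweight=_webal], by(vaa_use) statistics(mean sd)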

*** Estimating the odds ratio using logistic regression (May 2012)

logit turnout_may i.vaa_use [pweight=_webal], or

*** Estimating the difference in probability (May 2012)

margins, dydx(vaa_use)
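
*** Optional cross-check (not part of the original replication file): with a single binary predictor, the marginal effect above should match the difference between the two weighted turnout proportions reported here.
*** This is a sketch that assumes the entropy balancing weights are still stored in _webal.

mean turnout_may [pweight=_webal], over(vaa_use)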

*** Estimating the odds ratio using logistic regression (June 2012)

logit turnout_june i.vaa_use [pweight=_webal], or

*** Estimating the difference in probability (June 2012)

margins, dydx(vaa_use)

clear

*** CAICG Wave 1 & Wave 2

*** Place the 2015_wave1.dta and 2015_wave2.dta files in your working directory in Stata. These files contain only the variables used in the analysis.
*** Alternatively, you can download the full dataset from here: http://www.doi.org/10.7802/1594

*** CAICG Wave 1 

use 2015_wave1.dta

*** Creating/recoding the female variable

gen female_1=d1_1
recode female_1 (2=1) (1=0)

*** Creating/recoding the age variable

gen age_1=2015-d3_1

*** Creating/recoding the education variable
*** The original response categories are recoded as follows
*** Without degree (1) primary school (2) secondary (3) = 1
*** Highschool (4) = 2
*** Technical school (5) Vocational (6) = 3
*** University (7) = 4
*** Postgraduate (8) = 5

gen education_1=d4_1
recode education_1 (2=1) (3=1) (4=2) (5=3) (6=3) (7=4) (8=5) (99=.)

*** Creating/recoding the political interest variable

recode q1_1 99=.
gen interest_1=q1_1

*** Creating/recoding the left-right self-placement variable

recode q2_1 (99=.) (50=.)
gen leftright_1=q2_1

*** Creating/recoding the VAA awareness variable

gen vaa_aware_1=q39_1
recode vaa_aware_1 (2=0) (99=.)

*** Creating/recoding the VAA usage variable

gen vaa_use_1=q40_1
recode vaa_use_1 (2=0) (99=.)
recode vaa_use_1 .=0 if vaa_aware_1==0
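
*** Optional check (not part of the original replication file): cross-tabulate awareness and usage to confirm that respondents who were unaware of VAAs are now coded as non-users rather than missing.

tabulate vaa_aware_1 vaa_use_1, missing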

*** Creating/recoding the self-reported turnout for January 2015 variable

gen turnout_jan_1=q37_1
recode turnout_jan_1 (2=0) (8=.) (99=.)

*** Generating the covariate balancing weights using entropy balancing

ebalance vaa_use_1 interest_1 leftright_1 female_1 age_1 education_1, target(3 3 1 3 3)

*** Estimating the odds ratio using logistic regression (January 2015)

logit turnout_jan_1 i.vaa_use_1 [pweight=_webal], or

*** Estimating the difference in probability (January 2015)

margins, dydx(vaa_use_1)

clear

*** CAICG Wave 2

use 2015_wave2.dta

*** Creating/recoding the female variable

gen female_2=d1_2
recode female_2 (2=1) (1=0)

*** Creating/recoding the age variable

gen age_2=2015-d3_2

*** Creating/recoding the education variable
*** The original response categories are recoded as follows
*** Without degree (1) primary school (2) secondary (3) = 1
*** Highschool (4) = 2
*** Technical school (5) Vocational (6) = 3
*** University (7) = 4
*** Postgraduate (8) = 5

gen education_2=d4_2
recode education_2 (2=1) (3=1) (4=2) (5=3) (6=3) (7=4) (8=5) (99=.)

*** Creating/recoding the political interest variable

recode q1_2 99=.
gen interest_2=q1_2

*** Creating/recoding the left-right self-placement variable

recode q2_2 (99=.) (50=.)
gen leftright_2=q2_2

*** Creating/recoding the VAA awareness variable

gen vaa_aware_2=q27_2
recode vaa_aware_2 (2=0) (99=.)

*** Creating/recoding the VAA usage variable

gen vaa_use_2=q28_2
recode vaa_use_2 (2=0) (99=.)
recode vaa_use_2 .=0 if vaa_aware_2==0

*** Creating/recoding the self-reported turnout for September 2015 variable

gen turnout_sept_2=q31_2
recode turnout_sept_2 (2=0) (8=.) (99=.)

*** Generating the covariate balancing weights using entropy balancing

ebalance vaa_use_2 interest_2 leftright_2 female_2 age_2 education_2, target(3 3 1 3 3)

*** Estimating the odds ratio using logistic regression (September 2015)

logit turnout_sept_2 i.vaa_use_2 [pweight=_webal], or

*** Estimating the difference in probability (September 2015)

margins, dydx(vaa_use_2)

clear

*** Choose4Greece

*** Place the c4g_main.dta file in your working directory in Stata. This file contains only the variables used in the analysis.
*** Alternatively, you can download the full dataset from here: http://www.doi.org/10.7802/1758
*** In addition, place the c4g_matches.dta file in your working directory in Stata.

use c4g_main.dta

*** Cleaning

drop if less2>0
drop if less3>3
drop if totaltime<121
drop if totaltime>5399
drop if maxsuccessiveequalanswers>10
drop if noopinions>10
drop if attempts>1

*** Creating/recoding the mobilization dependent variable

recode voteprobability (-999=.) (-998=.)
recode voteprobability_optin (-999=.) (-998=.)
gen mobilization=voteprobability_optin-voteprobability

*** Creating/recoding the age variable

recode dob (-999=.) (-998=.)
gen age=2014-dob
replace age=. if age<18
replace age=. if age>80

*** Creating/recoding the female variable

recode sex (-999=.) (-998=.)
gen female=sex
recode female (0=1) (1=0)


*** Recoding the education variable
*** The original response categories are recoded as follows
*** Not completed primary (1) lower secondary (2) = 1
*** Upper secondary (3) = 2
*** technical/vocational (4) = 3
*** University (5) = 4
*** Post-graduate (6) = 5

recode education (-999=.) (-998=.)
recode education (2=1) (3=2) (4=3) (5=4) (6=5)

*** Creating/recoding the political interest variable

recode interestinpolitics (-999=.) (-998=.)
gen interest=interestinpolitics


*** Import and merge into the dataset the matches (hybrid algorithm) data

merge 1:1 id using "c4g_matches.dta"
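
*** Optional check (not part of the original replication file): inspect how many respondents were matched before relying on the *_hy variables.
*** _merge==1 marks rows found only in c4g_main.dta, _merge==2 rows found only in c4g_matches.dta, and _merge==3 matched rows.

tabulate _merge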


*** Generating the information variable (standard deviation of the matches with 10 parties)

egen information = rowsd(*_hy)
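
*** Optional check (not part of the original replication file): rowsd() ignores missing values within a row, so count the non-missing match scores per respondent and inspect the distribution of the information variable.
*** n_matches is a hypothetical helper name used only for this check; it is dropped again immediately.

egen n_matches = rownonmiss(*_hy)
tabulate n_matches
summarize information, detail
drop n_matches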

*** Generate percent matches for Figure 1

gen nd_hy_pc=nd_hy*100
gen syriza_hy_pc=syriza_hy*100
gen anel_hy_pc=anel_hy*100
gen pasok_hy_pc=pasok_hy*100
gen gd_hy_pc=gd_hy*100
gen kke_hy_pc=kke_hy*100
gen potami_hy_pc=potami_hy*100
gen kidhso_hy_pc=kidhso_hy*100
gen antarsya_hy_pc=antarsya_hy*100
gen dimar_hy_pc=dimar_hy*100


*** High-information example for Figure 1

graph bar *_hy_pc if information>.709&information<.774, hor legend(off)

*** Low-information example for Figure 1

graph bar *_hy_pc if information>.050077&information<.050079, hor legend(off)


*** prepare data to export for analysis in R

drop if _merge==2
keep id age female education interest mobilization information
export delimited using "2015_c4g.csv", replace


